Overview 🔍

This vignette walks you through spot calling across the different channels and imaging cycles of a MERFISH experiment.

To that end, I need to perform several subtasks: (1) establish parameters and parse necessary metadata, (2) load and process images, and perform floor-ceiling calculations, (3) decode pixel intensity vectors using the MERFISH codebook, (4) optimize the filtering of low-quality spots on a single field of view (FOV), and finally (5) run the optimized pipeline on all FOVs.

  1. Establish parameters and parse necessary metadata 🔅:
library(masmr)
library(ggplot2)
library(reshape2)
options(java.parameters = "-Xmx8g")

data_dir <- "~/JIN_SIM/20190718_WK_T2_a_B2_R8_12_d50/"
out_dir <- "~/OUT"

establishParams(
  imageDirs         =  paste0(data_dir, 'S10/'),
  imageFormat       = ".ome.tif",
  codebookFileName  = paste0(data_dir, "codebook/Jin_codebook_B.tsv"),
  codebookColumnNames = paste0(rep(c(0:3), 4), '_', rep(c('cy3', 'alexa594', 'cy5', 'cy7'), each=4) ),
  outDir            = out_dir,
  dapiDir           = paste0(data_dir, "DAPI_1/"),
  fpkmFileName      = paste0(data_dir, "FPKM_data/Jin_FPKMData_B.tsv"),
  nProbesFileName   = paste0(data_dir, "ProbesPerGene.csv"),
  verbose           = TRUE
)
## Assuming .ome.tif is extension for meta files...
## resumeMode = TRUE...
## 100 images across 16 cycles ( 4 ON-bit codes )...

I check that the global parameters are successfully updated.

ls( envir = params )
##  [1] "codebook_colnames"         "codebook_file"             "dapi_dir"                  "dapi_format"               "dir_names"                
##  [6] "fov_names"                 "fpkm_file"                 "im_dir"                    "im_format"                 "inferred_codebookColnames"
## [11] "meta_format"               "n_probes_file"             "nbits"                     "out_dir"                   "parent_out_dir"           
## [16] "python_location"           "raw_codebook"              "resumeMode"                "seed"                      "verbose"
head(params$raw_codebook)

I then read in image metadata (such as resolution and FOV names), and ensure that the bits in the codebook match the order of images loaded by readImageMetaData(). Additionally, 52 “new-blank” codes are added: binary encodings that are not used by any gene but can still be constructed, since there is room in the code space for 52 more “genes” separated by the Hamming distance used in the codebook.

readImageMetaData()
## Warning in readImageMetaData(): Updating params$fov_names
prepareCodebook()
## 
## Codebook has 57 blanks and 94 genes...
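To build intuition for why extra “new-blank” codes exist: any unused codeword that respects the codebook’s minimum Hamming distance can serve as an additional negative control. A toy base-R check of the Hamming distance between two hypothetical 16-bit, 4-ON-bit codewords:

```r
# Two hypothetical 16-bit MERFISH codewords with 4 ON bits each
codeA <- c(1, 1, 1, 1, rep(0, 12))
codeB <- c(1, 1, 0, 0, 1, 1, rep(0, 10))

# Hamming distance: number of positions at which the codewords differ
hammingDistance <- sum(codeA != codeB)
print(hammingDistance)  # 4
```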
  2. Check the format of .ome.tif files 📱:

Masmr has a function readImage() that can read in .ome.tif files using the RBioformats package, or .dax files using base R. In my experiment, a single .ome.tif file has 4 channels, one z-slice, and 2900 x 2900 dimensions. Since I have 4 folders titled “hyb00_1”, “hyb01_1”, “hyb02_1”, and “hyb03_1”, I can encode 16 bits in my codebook. In many cases, the image format will differ between datasets, and you should check out documentation sections 3.21, 3.3 and 8 for more details.

params$current_fov <- params$fov_names[38]
print(params$current_fov)
file <- unique( params$global_coords[params$global_coords$fov == params$current_fov, 'image_file'] )[1]
series <- which( params$fov_names == params$current_fov )

im <- readImage( 
  file, 
  nchannels = 4, # The image has 4 channels total
  nzs = 1,
  series = series )

print( length(im) ) #4 channels

imsub <- im[[2]] # choosing the second channel
plot(imager::as.cimg( imsub ), main = paste0('Image dimensions: ', paste(dim(imsub), collapse = ' x ') ) )
## Warning in as.cimg.array(imsub): Assuming third dimension corresponds to time/depth

## [1] "MMStack_1-Pos_002_003"
## [1] 4

For MERFISH, we typically have to handle multiple images at once. Thus, masmr has an additional image reading function called readImageList() that loads all images for a given FOV, by cross-referencing params$global_coords to find all relevant files. And because readImageList does this cross-referencing, parameters such as nchannels, nzs etc do not have to be specified.

imList <- readImageList( params$current_fov )
## 
## Reading images...
## 1 of 4...2 of 4...3 of 4...4 of 4...
par(mfrow=c(4,4))
for(n in names(imList)){
  plot(imager::as.cimg(sqrt(imList[[n]])), main = n) # applying sqrt transformation for visualization
}

  3. Establish global image parameters 📱:

As a first step in our pipeline, it is recommended to determine the dynamic range of intensity values: i.e. the brightness below which (the minimum brightness / floor) and above which (the maximum brightness / ceiling) I can safely ignore gradation. A sensible minimum brightness would be the intensity of background pixels where no tissue is imaged. For the maximum brightness, I want a cap such that all supra-threshold intensities can effectively be considered ‘ON’. Ideally, these limits should be relatively consistent across FOVs.
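The floor-ceiling idea can be sketched in a few lines of base R (the limits below are hypothetical; in practice they come from getAnchorParams()):

```r
# Hypothetical per-bit limits
brightness_min <- 100
brightness_max <- 5000

# Clip intensities to [floor, ceiling], then rescale to [0, 1]
im <- c(50, 100, 2550, 5000, 9000)
imNorm <- (pmin(pmax(im, brightness_min), brightness_max) - brightness_min) /
  (brightness_max - brightness_min)
print(imNorm)  # 0.0 0.0 0.5 1.0 1.0
```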

Function getAnchorParams() loads multiple FOVs and attempts to get global image parameters. As default, brightness_max and brightness_min are returned. For details, see the documentation for getAnchorParams().

params$verbose <- FALSE
getAnchorParams( anchorMode = 'grid', returnTroubleShootPlots = TRUE, freshAnchors = FALSE )

To evaluate the suitability of the chosen thresholds, users can refer to the troubleshootPlots variable returned when returnTroubleShootPlots is TRUE. Clearly, we are able to capture a spike in intensity values on the left of the histogram that corresponds to autofluorescent signal.

metric = 'brightness_min'
bit_number = 1
print( troubleshootPlots[[ metric ]][[ bit_number ]] )
## NULL

Lastly, I establish a clean slate for looping over images, clearing intermediate objects from memory.

establishCleanSlate()
##  [1] ".Random.seed"      "bit_number"        "data_dir"          "file"              "im"                "imList"            "imsub"            
##  [8] "metric"            "n"                 "out_dir"           "params"            "series"            "troubleshootPlots"
  4. Decode pixels using the MERFISH codebook 📕:

After figuring out floor and ceiling values, I need to optimize the filtering on a single non-empty FOV. Choose an FOV that is at least 50% non-empty.

I run masmr’s default image processing pipeline on this FOV, which separates spot signal from autofluorescent background in my .ome.tif images. Different functions are used for different parts of the masmr image processing pipeline and can be custom-designed by the user. For example, the “DECODE” function is optimized for decoding bright spots and comparing them to your gene codebook.

params$verbose <- FALSE
## Get image metrics
getImageMetrics(imList,
                imageFunctions = list(
                  'ORIG' = imReturn,
                  'NORM' = imWinsorIntensities,
                  'COARSE' = imLowPass,
                  'DECODE' = imForDecode,
                  'MASK' = imForMask
                ))
## 
par(mfrow=c(2,3))
for( imMetricName in ls(imMetrics) ){
  imsub <- imMetrics[[imMetricName]][[1]]
  plot(imager::as.cimg(imsub[1000:1500, 1000:1500, 1] ), main = imMetricName)
}
par(mfrow=c(1,1))

I then register the images using registerImages(), so that hybridization rounds are visually comparable. Images of the same FOV are rarely perfect overlaps of each other, due to imprecise microscope movements or optical distortions. The function works by maximising cross-correlation between two images (in Fourier-transformed space).
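The idea of maximising cross-correlation in Fourier space can be illustrated with a toy phase-correlation example in base R (a synthetic image with a known circular shift; registerImages() is considerably more sophisticated than this):

```r
set.seed(1)
ref <- matrix(runif(64 * 64), 64, 64)

# Circularly shift the reference by (5, 3) pixels
shifted <- ref[c(6:64, 1:5), c(4:64, 1:3)]

# Phase correlation: normalised cross-power spectrum, inverse FFT;
# the location of the correlation peak gives the shift
crossPower <- fft(ref) * Conj(fft(shifted))
corr <- Re(fft(crossPower / Mod(crossPower), inverse = TRUE))
peak <- which(corr == max(corr), arr.ind = TRUE)
estimatedShift <- peak - 1
print(estimatedShift)  # row shift 5, column shift 3
```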

params$verbose <- FALSE
forReg <- list()
for(i in 1:length(imList)){
  forReg[[i]] <- imNormalise( imMetrics$COARSE[[i]] + imMetrics$DECODE[[i]] )
}
names(forReg) <- names(imList)
registerImages(forReg, returnTroubleShootPlots = TRUE)
## Warning in maxIntensityProject(imsub): 2D matrix provided...Skipping MIP...
p1 <- troubleshootPlots[['REGISTRATION_EVAL']][[1]][['PRE_REGISTER']] + ggtitle('Before registration')
p2 <- troubleshootPlots[['REGISTRATION_EVAL']][[1]][['POST_REGISTER']] + ggtitle('After registration')

cowplot::plot_grid( plotlist = list(p1,p2))

Having aligned and processed our images, the next step is to compile our images into a manageable (memory-wise) dataframe with a unified coordinate system, in which pixels are rows and image metrics are columns.

params$verbose <- FALSE
spotcalldf <- consolidateImageMetrics()

head( spotcalldf )

Thirdly, I decode spots using my codebook. Decoding involves calculating distances between image metric vectors and expected vectors from the codebook. For example, the table below shows how pixels with low cosine distance are confidently assigned to specific genes like MKI67, while blanks have significantly higher cosine distances.

params$verbose <- FALSE
spotcalldf <- decode(spotcalldf)
head(spotcalldf[,grepl('^g$|LAB$', colnames(spotcalldf))])
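For intuition, decoding against the codebook can be sketched as a toy cosine-distance computation in base R (the codebook and pixel vector below are hypothetical):

```r
# Hypothetical 4-bit codebook: rows are expected ON/OFF patterns
codebook <- rbind(
  GENE_A  = c(1, 1, 0, 0),
  GENE_B  = c(0, 1, 1, 0),
  BLANK_1 = c(1, 0, 0, 1)
)

# A pixel's normalised intensity vector across the 4 bits
pixel <- c(0.9, 0.8, 0.1, 0.05)

# Cosine distance to each codeword; assign the nearest identity
cosineDist <- apply(codebook, 1, function(code) {
  1 - sum(pixel * code) / (sqrt(sum(pixel^2)) * sqrt(sum(code^2)))
})
assigned <- names(which.min(cosineDist))
print(assigned)  # "GENE_A"
```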

The final step involves classifying pixels as either genuine gene signals or “blanks” to control for false positive detections, since blank barcodes in the codebook serve as negative controls for spurious signals. The scoreBlanks() function uses distance metrics and additional heuristics to distinguish real gene expression from random noise or imaging artifacts, ensuring data quality by preventing false signals from contaminating the final dataset.

params$verbose <- FALSE
spotcalldf$BLANK <- scoreBlanks( spotcalldf ) >0
  5. Optimize pixel filtering for single FOV ➡️:

I can now determine optimal cutoffs for signal processing empirically, by finding the threshold that separates blank from non-blank calls. Here, I take the average of the 50th quantiles (medians) of the TRUE (blank) and FALSE (non-blank) distributions, using the function thresh_Quantile().

filtout <- rep(0, nrow(spotcalldf))
for( distanceMetric in c('COS', 'EUC', 'L2E') ){
  thresh <- thresh_Quantile( spotcalldf[,distanceMetric],
                             label = spotcalldf$BLANK,
                             quantileFalse = 0.5, quantileTrue = 0.5 )
  cat(paste0( '\n', distanceMetric, ' threshold: ', thresh ))
  filtout <- filtout + spotcalldf[,distanceMetric] > thresh
}

ggplot( reshape2::melt(spotcalldf, measure.vars=c('COS', 'EUC', 'L2E'))  ) +
  geom_violin(aes(x=BLANK, y=value, colour=BLANK) ) +
  geom_boxplot(aes(x=BLANK, y=value, colour=BLANK), fill='transparent', width=0.25 ) +
  facet_wrap(~variable, scales='free') +
  ylab('') +
  theme_minimal(base_size=14) +
  scale_colour_manual(values=c('TRUE' = 'red', 'FALSE' = 'black')) +
  theme(legend.position = 'none')

## 
## COS threshold: 0.222254471692501
## EUC threshold: 3.30408867238758
## L2E threshold: 0.326703039218849
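For intuition, this style of quantile threshold can be sketched in base R. Note this assumes thresh_Quantile() simply averages the requested quantiles of the two classes; masmr’s actual implementation may differ:

```r
set.seed(42)

# Synthetic distance metrics: non-blanks around 0.2, blanks around 0.5
dists <- c(rnorm(100, mean = 0.2, sd = 0.05),
           rnorm(100, mean = 0.5, sd = 0.05))
isBlank <- rep(c(FALSE, TRUE), each = 100)

# Average of the two class medians (50th quantiles)
thresh <- mean(c(quantile(dists[!isBlank], 0.5),
                 quantile(dists[isBlank], 0.5)))
print(thresh)  # roughly midway between the two class medians
```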

How do I decide what is a good filter to apply to our data? One idea is to use a priori knowledge of gene distributions in space. For instance, in our example dataset of an organoid, we expect dividing cells to be concentrated near the lumen. Accordingly, filtering should enrich lumenal MKI67 (a marker for cell division).

cowplot::plot_grid( plotlist = list(
  ggplot( spotcalldf[filtout>0 & spotcalldf$g=='MKI67',] ) +
    geom_hex(aes(x=WX, y=WY), bins=100) +
    scale_fill_viridis_c(option='turbo', trans='log10') +
    scale_y_reverse() +
    theme_void( base_size = 14 ) +
    coord_fixed() + ggtitle('Drop'),
  
  ggplot( spotcalldf[filtout==0  & spotcalldf$g=='MKI67',] ) +
    geom_hex(aes(x=WX, y=WY), bins=100) +
    scale_fill_viridis_c(option='turbo', trans='log10') +
    scale_y_reverse() +
    theme_void( base_size = 14 ) +
    coord_fixed() + ggtitle('Keep')
), nrow=1)

Another option is to consider how filtering changes quality control metrics: e.g. a good filter should generally increase the correlation between count profiles and previously obtained FPKM data, and should decrease the percentage of blanks relative to true genes. We provide a filterDF() function that reports how a given filter changes these QC metrics.

params$verbose <- TRUE
spotcalldf <- filterDF(spotcalldf, filtout>0, returnTroubleShootPlots = T, plotWindowRadius = 10)
## 
## Applying filter: filtout > 0...
## N pixels: 2911255 ( 100% ) --> 824288 ( 28.31% )
## N blanks: 871827 ( 29.95% ) --> 207062 ( 25.12% )
## LogFPKM Corr: 0.58239 --> 0.59519
## LogN Probes Corr: 0.52337 --> 0.51121
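The FPKM correlation check itself is simple: correlate per-gene spot counts against bulk FPKM values on a log scale. A toy version with hypothetical values:

```r
# Hypothetical per-gene spot counts and matched bulk FPKM values
counts <- c(GENE_A = 120, GENE_B = 45, GENE_C = 300)
fpkm   <- c(GENE_A = 10.2, GENE_B = 3.1, GENE_C = 25.0)

# Log-log (Pearson) correlation; pseudocount avoids log10(0)
logFpkmCorr <- cor(log10(counts + 1), log10(fpkm + 1))
print(logFpkmCorr)
```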

Perhaps the safest option – in that it does not require a priori assumptions – is to perform some decoding by eye on a small FOV region, and ensure that one’s spot-calling pipeline returns results that agree with these manually obtained results. To assist this, masmr provides a returnTroubleShootPlots parameter in filterDF(). When set to TRUE (default FALSE), plots are returned showing pixel intensity images for each bit, overlaid with gene labels.

p3 <- troubleshootPlots[['BEFORE']][[1]] + ggtitle('Before filtering')
p4 <- troubleshootPlots[['AFTER']][[1]] + ggtitle('After filtering')

cowplot::plot_grid( plotlist = list(p3, p4))

The next optimization step is to binarize the continuous values in our pixels as ON or OFF for each bit, and to calculate the Hamming distance to the codebook. I will filter out pixels that differ by more than 1 Hamming distance from the expected binary encoding in my codebook.

params$verbose <- FALSE
spotcalldf <- bitsFromDecode(spotcalldf, trainIndex = !spotcalldf$BLANK,
                               fBeta = 2, metricToMaximise = 'fb')
table(spotcalldf$HD)
params$verbose <- TRUE
spotcalldf <- filterDF(spotcalldf, spotcalldf$HD > 1)
## 
## Applying filter: spotcalldf$HD > 1...
## N pixels: 824288 ( 100% ) --> 215762 ( 26.18% )
## N blanks: 207062 ( 25.12% ) --> 46643 ( 21.62% )
## LogFPKM Corr: 0.59519 --> 0.62291
## LogN Probes Corr: 0.51121 --> 0.51129
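The binarise-then-compare step can be sketched in base R with hypothetical per-bit thresholds (bitsFromDecode() learns these thresholds, here by maximising an F-beta score):

```r
# A pixel's DECODE intensities across 4 bits, and hypothetical learned thresholds
intensities   <- c(0.9, 0.7, 0.2, 0.1)
bitThresholds <- c(0.5, 0.5, 0.5, 0.5)

# Binarise, then compute Hamming distance to an expected codeword
bits     <- as.integer(intensities > bitThresholds)
expected <- c(1, 1, 0, 0)
hd <- sum(bits != expected)
print(hd)  # 0: a perfect match easily passes an HD filter
```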

Binarisation is recommended if there is a clear cutoff in intensities between 0 vs 1 bits in the DECODE images. This assumption should be true in most cases, but may be violated if there are very different numbers of encoding probes for different genes.

Beyond image-codebook vector distances, we also provide other functions that users may find useful for performing further quality control checks:

  1. scoreDeltaF(), which calculates spot-quality metrics in the form of delta F (ΔF/F0) scores. I expect low-confidence spots to have, on average, low ΔF/F0, i.e. their ON bits are less distinguishable from their OFF bits. I filter out pixels that fall below my threshold (the 25th quantile of the DFF0 metric) while having a non-zero F0 metric.
params$verbose <- FALSE
spotcalldf <- scoreDeltaF( spotcalldf )
ggplot(spotcalldf[spotcalldf$F0_DECODE!=0,]) +
  geom_violin(aes(x=BLANK, y=DFF0_DECODE, colour=BLANK) ) +
  geom_boxplot(aes(x=BLANK, y=DFF0_DECODE, colour=BLANK), fill='transparent', width=0.25 ) +
  scale_y_log10() +
  theme_minimal(base_size=20) +
  scale_colour_manual(values=c('TRUE' = 'red', 'FALSE' = 'black')) +
  theme(legend.position = 'none')

params$verbose <- TRUE
thresh <- thresh_Quantile( 
  spotcalldf[,'DFF0_DECODE'], 
  label = spotcalldf$BLANK, 
  quantileFalse = 0.25, quantileTrue = 0.25 )
spotcalldf <- filterDF(spotcalldf, spotcalldf[,'DFF0_DECODE'] < thresh & spotcalldf$F0_DECODE!=0)
## 
## Applying filter: spotcalldf[, "DFF0_DECODE"] < thresh & ......
## N pixels: 215762 ( 100% ) --> 165542 ( 76.72% )
## N blanks: 46643 ( 21.62% ) --> 32968 ( 19.92% )
## LogFPKM Corr: 0.62291 --> 0.635
## LogN Probes Corr: 0.51129 --> 0.51739
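As a rough sketch of a ΔF/F0-style score (assuming F1 is the mean ON-bit intensity and F0 the mean OFF-bit intensity; masmr’s exact definitions may differ):

```r
# A pixel's intensities and its assigned ON/OFF bit pattern
intensities <- c(0.9, 0.8, 0.1, 0.2)
onBits <- c(TRUE, TRUE, FALSE, FALSE)

F1 <- mean(intensities[onBits])    # mean ON-bit intensity
F0 <- mean(intensities[!onBits])   # mean OFF-bit intensity
DFF0 <- (F1 - F0) / F0             # high when ON bits stand out from OFF bits
print(DFF0)
```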
  2. The spatialIsAdjacent() function searches the immediate surroundings of a set of query coordinates and asks whether a non-blank pixel is next to a blank. Such pixels are likely to be false positives, and I will remove them.
params$verbose <- FALSE
spotcalldf$NEXTTOBLANK <- spatialIsAdjacent( spotcalldf, 'BLANK' )

ggplot( reshape2::melt( data.frame(spotcalldf)[!spotcalldf$BLANK, ], measure.vars=c('COS', 'EUC', 'L2E'))  ) +
  geom_violin(aes(x=NEXTTOBLANK, y=value, colour=NEXTTOBLANK) ) +
  geom_boxplot(aes(x=NEXTTOBLANK, y=value, colour=NEXTTOBLANK), fill='transparent', width=0.25 ) +
  facet_wrap(~variable, scales='free') +
  ylab('') +
  theme_minimal(base_size=14) +
  scale_colour_manual(values=c('TRUE' = 'red', 'FALSE' = 'black')) +
  theme(legend.position = 'none') +
  ggtitle('Non-blank spots') + xlab('Is next to a blank')

params$verbose <- TRUE
spotcalldf <- filterDF(spotcalldf, spotcalldf$NEXTTOBLANK & !spotcalldf$BLANK)
## 
## Applying filter: spotcalldf$NEXTTOBLANK & !spotcalldf$BLANK...
## N pixels: 165542 ( 100% ) --> 162847 ( 98.37% )
## N blanks: 32968 ( 19.92% ) --> 32968 ( 20.24% )
## LogFPKM Corr: 0.635 --> 0.63057
## LogN Probes Corr: 0.51739 --> 0.50968
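A toy version of the adjacency test in base R, assuming integer pixel coordinates in columns WX/WY (masmr’s implementation will be more efficient):

```r
df <- data.frame(
  WX    = c(1, 2, 10),
  WY    = c(1, 1, 10),
  BLANK = c(TRUE, FALSE, FALSE)
)

# Flag pixels with a blank within one pixel (Chebyshev distance <= 1)
nextToBlank <- sapply(seq_len(nrow(df)), function(i) {
  any(df$BLANK[-i] &
        abs(df$WX[-i] - df$WX[i]) <= 1 &
        abs(df$WY[-i] - df$WY[i]) <= 1)
})
print(nextToBlank)  # FALSE TRUE FALSE
```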
  3. The spatialHNNDist() function calculates how far apart any two pixels with the same identity are from each other, here called the “homotypic nearest neighbour” (HNN) distance. In general, blanks are less likely to be next to each other (possibly because of erroneous single-pixel calls in noisy regions), while spots from the same gene tend to cluster, i.e. tend to have low HNN distances (below). I remove pixels with an HNN distance greater than 5 pixels, or with no homotypic neighbour at all.
params$verbose <- FALSE
spotcalldf$HNNDIST <- spatialHNNDist( spotcalldf )
hnnDistBins <- factor(cut(spotcalldf$HNNDIST, breaks=c(0,1,2,3,4,5,Inf)))
ggplot( data.frame(spotcalldf, hnnDistBins)  ) +
  scale_fill_viridis_d(name = 'HNN distance range (pixels)', option='turbo') +
  geom_bar(aes(x=BLANK, fill=hnnDistBins), position='fill' ) +
  theme_minimal(base_size=14) +
  theme(legend.position = 'right') + ylab('Proportion')

params$verbose <- TRUE
spotcalldf <- filterDF( spotcalldf, spotcalldf$HNNDIST > 5 | is.na(spotcalldf$HNNDIST) )
## 
## Applying filter: spotcalldf$HNNDIST > 5 | is.na(spotcalldf$HNNDIST)...
## N pixels: 162847 ( 100% ) --> 142966 ( 87.79% )
## N blanks: 32968 ( 20.24% ) --> 27012 ( 18.89% )
## LogFPKM Corr: 0.63057 --> 0.63773
## LogN Probes Corr: 0.50968 --> 0.51393
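A toy homotypic nearest-neighbour distance in base R (assuming gene labels in g and coordinates in WX/WY; a pixel with no homotypic partner gets NA, consistent with the is.na() handling above):

```r
df <- data.frame(
  g  = c("A", "A", "A", "B"),
  WX = c(0, 1, 10, 0),
  WY = c(0, 0, 0, 5)
)

# For each pixel, distance to the nearest pixel sharing its gene identity
hnnDist <- sapply(seq_len(nrow(df)), function(i) {
  same <- setdiff(which(df$g == df$g[i]), i)
  if (length(same) == 0) return(NA_real_)  # no homotypic neighbour
  min(sqrt((df$WX[same] - df$WX[i])^2 + (df$WY[same] - df$WY[i])^2))
})
print(hnnDist)  # 1 1 9 NA
```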
  4. The scoreClassifier() function builds a classifier (logistic regression by default) to distinguish one class of pixels from another, using the pixel-wise metrics in our dataframe. Users can specify which covariates to include in the model.

Below is an example of predicting blanks from non-blanks. I exclude bit-specific columns (e.g. DECODE_01), covariates that are likely to highly correlate with others (e.g. F1_DECODE and DFF0_DECODE are likely to highly correlate), coordinate information, and any covariate that is equivalent to BLANK.
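Under the hood, the default is ordinary logistic regression; a minimal base-R sketch on synthetic data (scoreClassifier() additionally handles covariate exclusion, so treat this only as the core idea):

```r
set.seed(0)

# Synthetic pixels: blanks tend to have higher cosine distances
df <- data.frame(
  COS   = c(rnorm(50, mean = 0.2, sd = 0.05),
            rnorm(50, mean = 0.3, sd = 0.05)),
  BLANK = rep(c(FALSE, TRUE), each = 50)
)

# Logistic regression of BLANK on the distance metric
fit <- glm(BLANK ~ COS, data = df, family = binomial)
pBlank <- predict(fit, type = "response")  # probability of being a blank
```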

forExclusion <- c('Xm', 'Ym', 'fov', 'g', 'WX', 'WY', 'IDX', colnames(spotcalldf)[grepl('_\\d+$|^DFF0_|^F1_|CV_|LAB$', colnames(spotcalldf))])
probabilityBLANK <- scoreClassifier(spotcalldf, 'BLANK', variablesExcluded = forExclusion)
## 
## Checking which covariates to exclude...
## 
## Formula: OUT ~ COS + EUC + L2E + HD + F0_ORIG + DF_ORIG + F1VAR_ORIG + F0VAR_ORIG + F0_NORM + DF_NORM + F1VAR_NORM + F0VAR_NORM + F0_COARSE + DF_COARSE + F1VAR_COARSE + F0VAR_COARSE + F0_DECODE + DF_DECODE + F1VAR_DECODE + F0VAR_DECODE + F0_MASK + DF_MASK + F1VAR_MASK + F0VAR_MASK + HNNDIST
spotcalldf$PBLANK <- probabilityBLANK

thresh <- thresh_Quantile(spotcalldf$PBLANK, spotcalldf$BLANK, quantileFalse = 0.5, quantileTrue = 0.25)
ggplot( spotcalldf  ) +
  geom_violin(aes(x=BLANK, y=PBLANK, colour=BLANK) ) +
  geom_boxplot(aes(x=BLANK, y=PBLANK, colour=BLANK), fill='transparent', width=0.25 ) +
  ylab('Probability of being a blank') +
  theme_minimal(base_size=14) +
  scale_colour_manual(values=c('TRUE' = 'red', 'FALSE' = 'black')) +
  theme(legend.position = 'none') +
  geom_hline(yintercept=thresh, linetype = 'dashed', colour='grey')

params$verbose <- TRUE
spotcalldf <- filterDF(spotcalldf, spotcalldf$PBLANK > thresh)
## 
## Applying filter: spotcalldf$PBLANK > thresh...
## N pixels: 142966 ( 100% ) --> 73262 ( 51.24% )
## N blanks: 27012 ( 18.89% ) --> 5212 ( 7.11% )
## LogFPKM Corr: 0.63773 --> 0.559
## LogN Probes Corr: 0.51393 --> 0.43978
  5. Lastly, spatialClusterLeiden() collapses clusters of adjacent pixels with the same gene identity into a single punctum. We want clusters that are at least 3 pixels big, with a maximum distance of 3 pixels between member pixels, and that are below 250 pixels in size. After running spatialClusterLeiden(), I visualize the results in a bigger window to manually confirm that adjacent pixels are collapsed to a single punctum.
params$verbose <- FALSE
clusterInfo <- spatialClusterLeiden( spotcalldf, minNeighbours = 3, maxInterSpotDistancePixels = 3)
spotcalldf[,colnames(clusterInfo)] <- clusterInfo
maxClusterSize = 250
## Specify window to plot
cxWindow <- c(1200, 1300, 1200, 1300)

## Using previously established palette (established during prepareCodebook):
genePalette <- params$genePalette

cowplot::plot_grid( plotlist = list(
  ggplot( 
    spotcalldf[
      spotcalldf$WX > cxWindow[1] 
      & spotcalldf$WX < cxWindow[2] 
      & spotcalldf$WY > cxWindow[3] 
      & spotcalldf$WY < cxWindow[4], ]
    ) + 
    geom_text( aes(
      x=WX, y=WY, label=g, 
      colour=factor(g, levels=rownames(params$ordered_codebook))) 
      ) +
    scale_colour_manual( values=genePalette ) +
    theme_void(base_size=14) +
    theme(legend.position = 'none', 
          plot.title = element_text(hjust=0.5)) +
    scale_x_continuous( limits=c(cxWindow[1], cxWindow[2]) ) +
    scale_y_reverse( limits=c(cxWindow[4], cxWindow[3]) ) +
    ggtitle('All pixels'),
  
    ggplot( 
    spotcalldf[
      spotcalldf$CENTROID
      & spotcalldf$WX > cxWindow[1] 
      & spotcalldf$WX < cxWindow[2] 
      & spotcalldf$WY > cxWindow[3] 
      & spotcalldf$WY < cxWindow[4], ]
    ) + 
    geom_text( aes(
      x=WX, y=WY, label=g, 
      colour=factor(g, levels=rownames(params$ordered_codebook))) 
      ) +
    scale_colour_manual( values=genePalette ) +
    theme_void(base_size=14) +
    theme(legend.position = 'none', 
          plot.title = element_text(hjust=0.5)) +
    scale_x_continuous( limits=c(cxWindow[1], cxWindow[2]) ) +
    scale_y_reverse( limits=c(cxWindow[4], cxWindow[3]) ) +
    ggtitle('Centroids only')
  
), nrow=1 )

params$verbose <- TRUE
spotcalldf <- filterDF(spotcalldf, !(spotcalldf$CENTROID & spotcalldf$CLUSTER_SIZE < maxClusterSize))
## 
## Applying filter: !(spotcalldf$CENTROID & spotcalldf$CLUSTER_SIZE < maxClusterSize)...
## N pixels: 73262 ( 100% ) --> 5928 ( 8.09% )
## N blanks: 5212 ( 7.11% ) --> 417 ( 7.03% )
## LogFPKM Corr: 0.559 --> 0.57403
## LogN Probes Corr: 0.43978 --> 0.45322
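The collapse of a pixel cluster to a single centroid can be sketched in base R (hypothetical cluster labels; spatialClusterLeiden() also performs the clustering itself):

```r
# Hypothetical pixels already assigned to clusters
df <- data.frame(
  cluster = c(1, 1, 1, 2),
  WX = c(10, 11, 12, 50),
  WY = c(5, 5, 6, 50)
)

# One centroid (punctum) per cluster
centroids <- aggregate(cbind(WX, WY) ~ cluster, data = df, FUN = mean)
print(centroids)
```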

All this suggests that the pipeline is sufficiently optimized on a single FOV.

  6. Run the optimized pipeline on all FOVs 🎯:
checkReadyLoop()
params$verbose <- FALSE

## Start loop
for( fovName in params$fov_names ){
  
  ## Report current FOV / iteration 
  message( paste0(
    '\n', fovName, ': ', 
    which(params$fov_names==fovName), ' of ', 
    length(params$fov_names), '...' ) )
  
  ## Check if file has been created, skip if yes
  if( params$resumeMode ){
    checkFile <- file.exists( paste0(params$out_dir, 'SPOTCALL_', fovName, '.csv.gz') )
    if(checkFile){
      message('FOV has been processed...Skipping...')
      next
    }
  }
  
  ## Keep track of the current_fov
  params$current_fov <- fovName
  
  ## Load current FOV images
  imList <- readImageList( params$current_fov )
  
  ## Check if image is largely empty, if so, skip
  percBlank <- mean( sapply(1:length(imList), function(i){
    sum( imList[[i]] < params$brightness_min[i] ) / length(imList[[i]])
  }) )
  if( percBlank >= 0.95 ){ next }
  
  ## Process an individual image
  getImageMetrics( imList )
  
  ## Register
  forReg <- list()
  for(i in 1:length(imList)){
    forReg[[i]] <- imNormalise( imMetrics$COARSE[[i]] + imMetrics$DECODE[[i]] )
  }
  names(forReg) <- names(imList)
  registerImages(forReg)
  
  ## Consolidate into pixel wise metric dataframe
  spotcalldf <- consolidateImageMetrics()
  cleanUp( c('spotcalldf', cleanSlate) )
  
  ## Decode and annotate blanks
  spotcalldf <- decode(spotcalldf)
  spotcalldf$BLANK <- scoreBlanks( spotcalldf ) >0
  
  ## Binarise and threshold on Hamming distance
  spotcalldf <- bitsFromDecode(spotcalldf, trainIndex = !spotcalldf$BLANK)
  spotcalldf <- filterDF(spotcalldf, spotcalldf$HD>1)
  
  ## Distance-based filtering
  filtout <- rep(0, nrow(spotcalldf))
  for( distanceMetric in c('COS', 'EUC', 'L2E') ){
    thresh <- thresh_Quantile( 
      spotcalldf[,distanceMetric], 
      label = spotcalldf$BLANK, 
      quantileFalse = 0.5, quantileTrue = 0.5 )
    filtout <- filtout + spotcalldf[,distanceMetric] > thresh
  }
  spotcalldf <- filterDF(spotcalldf, filtout>0)
  
  ## Filter nonblanks that are next to blanks
  spotcalldf$NEXTTOBLANK <- spatialIsAdjacent( spotcalldf, 'BLANK' )
  spotcalldf <- filterDF(spotcalldf, spotcalldf$NEXTTOBLANK & !spotcalldf$BLANK)
 
  ## Filter using DFF0 metrics
  spotcalldf <- scoreDeltaF( spotcalldf )
  thresh <- thresh_Quantile( 
    spotcalldf[,'DFF0_DECODE'], 
    label = spotcalldf$BLANK, 
    quantileFalse = 0.25, quantileTrue = 0.25 )
  spotcalldf <- filterDF(spotcalldf, spotcalldf[,'DFF0_DECODE'] < thresh & spotcalldf$F0_DECODE!=0)
  
  ## Filter based on HNN distance
  spotcalldf$HNNDIST <- spatialHNNDist( spotcalldf )
  spotcalldf <- filterDF( spotcalldf, spotcalldf$HNNDIST > 5 | is.na(spotcalldf$HNNDIST) )
  
  ## Filter based on classifier  
  forExclusion <- c('Xm', 'Ym', 'fov', 'g', 'WX', 'WY', 'IDX', 
                    colnames(spotcalldf)[grepl('_\\d+$|^DFF0_|^F1_|CV_|LAB$', colnames(spotcalldf))])
  spotcalldf$PBLANK <- scoreClassifier(spotcalldf, 'BLANK', variablesExcluded = forExclusion)
  thresh <- thresh_Quantile(spotcalldf$PBLANK, spotcalldf$BLANK, quantileFalse = 0.5, quantileTrue = 0.25)
  spotcalldf <- filterDF(spotcalldf, spotcalldf$PBLANK > thresh)
  
  ## Cluster
  clusterInfo <- spatialClusterLeiden( spotcalldf, minNeighbours = 3, maxInterSpotDistancePixels = 3)
  spotcalldf[,colnames(clusterInfo)] <- clusterInfo
  maxClusterSize = 250
  spotcalldf <- filterDF(spotcalldf, !(spotcalldf$CENTROID & spotcalldf$CLUSTER_SIZE < maxClusterSize) )
  saveDF( spotcalldf )
  
  cleanUp()
}

🎉 Congratulations! You have just finished spot calling!

Advanced usage examples are covered in the advanced usage handbook.